Bootstrapping a System for Phoneme Recognition and Keyword Spotting in Unaccompanied Singing

Author

Anna M. Kruspe

Abstract

Speech recognition in singing is still a largely unsolved problem. Acoustic models trained on speech usually produce unsatisfactory results when used for phoneme recognition in singing. At the same time, there is no phonetically annotated singing data set that could be used to train more accurate acoustic models for this task. In this paper, we attempt to solve this problem using the DAMP data set, which contains a large number of good-quality recordings of amateur singing. We first align them to the matching textual lyrics using an acoustic model trained on speech. We then use the resulting phoneme alignment to train new acoustic models using only subsets of the DAMP singing data. These models are then tested for phoneme recognition and, building on that, keyword spotting. Evaluation is performed for different subsets of DAMP and for an unrelated set of the vocal tracks of commercial pop songs. Results are compared to those obtained with acoustic models trained on the TIMIT speech data set and on a version of TIMIT augmented for singing. Our new approach shows significant improvements over both.
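
The step from a phoneme alignment to training data for a new acoustic model can be illustrated with a minimal sketch. It is not the paper's implementation: the segment format, the 10 ms frame hop, the "sil" filler label, and the function name `segments_to_frame_labels` are all assumptions made for illustration. The sketch only shows how time-stamped phoneme segments, such as a forced aligner might produce, could be turned into one label per feature frame.

```python
from typing import List, Tuple

# Hypothetical segment format: (phoneme, start_sec, end_sec).
Segment = Tuple[str, float, float]


def segments_to_frame_labels(
    segments: List[Segment],
    num_frames: int,
    hop_sec: float = 0.01,  # assumed 10 ms frame hop
    fill: str = "sil",      # assumed label for frames no segment covers
) -> List[str]:
    """Expand time-stamped phoneme segments into one label per frame."""
    labels = [fill] * num_frames
    for phoneme, start, end in segments:
        # Convert times to frame indices and clip to the valid range.
        first = max(0, int(round(start / hop_sec)))
        last = min(num_frames, int(round(end / hop_sec)))
        for i in range(first, last):
            labels[i] = phoneme
    return labels


# Toy alignment for a 12-frame excerpt (values are made up).
alignment = [("HH", 0.00, 0.03), ("EH", 0.03, 0.08), ("L", 0.08, 0.10)]
frame_labels = segments_to_frame_labels(alignment, num_frames=12)
```

Frame-level labels of this kind could then be paired with acoustic features of the same recordings as targets for a frame-wise phoneme classifier.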

Similar resources

Keyword Spotting in A-capella Singing

Keyword spotting (or spoken term detection) is an interesting task in Music Information Retrieval that can be applied to a number of problems. Its purposes include topical search and improvements for genre classification. Keyword spotting is a well-researched task on pure speech, but state-of-the-art approaches cannot be easily transferred to singing because phoneme durations have much higher v...

Retrieval of Textual Song Lyrics from Sung Inputs

Retrieving the lyrics of a sung recording from a database of text documents is a research topic that has not received attention so far. Such a retrieval system has many practical applications, e.g. for karaoke applications or for indexing large song databases by their lyric content. In this paper, we present such a lyrics retrieval system. In a first step, phoneme posteriorgrams are extracted f...

Noise Robust Keyword Spotting Using Deep Neural Networks For Embedded Platforms

The recent development of embedded platforms along with spectacular growth in communication networking technologies is driving the Internet of things to thrive. More complex tasks are now possible to operate in small devices such as speech recognition and keyword spotting which are in great demand. Traditional voice recognition approaches are already being used in several embedded applications,...

Comparison of keyword spotting approaches for informal continuous speech

This paper describes several approaches to keyword spotting (KWS) for informal continuous speech. We compare acoustic keyword spotting, spotting in word lattices generated by large vocabulary continuous speech recognition and a hybrid approach making use of phoneme lattices generated by a phoneme recognizer. The systems are compared on carefully defined test data extracted from ICSI meeting dat...

Phoneme Based Acoustics Keyword Spotting in Informal Continuous Speech

This paper describes several ways of keywords spotting (KWS), based on Gaussian mixture (GM) hidden Markov modelling (HMM). Context-independent and dependent phoneme models are used in our system. The system was trained and evaluated on informal continuous speech. We used different complexities of KWS recognition networks and different types of phoneme models. The impact of these parameters on ...

Publication date: 2016
